Protein Structure Prediction: Selecting Salient Features from Large Candidate Pools

Authors

  • Kevin J. Cherkauer
  • Jude W. Shavlik
Abstract

We introduce a parallel approach, "DT-Select," for selecting features used by inductive learning algorithms to predict protein secondary structure. DT-Select is able to rapidly choose small, nonredundant feature sets from pools containing hundreds of thousands of potentially useful features. It does this by building a decision tree, using features from the pool, that classifies a set of training examples. The features included in the tree provide a compact description of the training data and are thus suitable for use as inputs to other inductive learning algorithms. Empirical experiments in the protein secondary-structure task, in which sets of complex features chosen by DT-Select are used to augment a standard artificial neural network representation, yield surprisingly little performance gain, even though features are selected from very large feature pools. We discuss some possible reasons for this result.
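The selection idea in the abstract (grow a decision tree over the candidate pool, then keep only the features the tree actually tests) can be illustrated with a minimal serial sketch. This is not the authors' implementation: the abstract does not state the splitting criterion or how the work is parallelized, so the information-gain criterion, the function names, and the toy three-residue windows and feature names (`has_A`, `has_G`, and so on) are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(examples, labels, feature):
    """Entropy reduction obtained by splitting on one Boolean feature function."""
    split = {True: [], False: []}
    for x, y in zip(examples, labels):
        split[bool(feature(x))].append(y)
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in split.values() if part
    )
    return entropy(labels) - remainder


def select_features(examples, labels, pool, selected=None):
    """Greedily grow a tree and return the names of the features it tests.

    Assumption: splits are chosen by information gain; the abstract does not
    say which criterion DT-Select uses.
    """
    if selected is None:
        selected = []
    if len(set(labels)) <= 1:  # pure (or empty) node: nothing left to split
        return selected
    gains = {name: information_gain(examples, labels, f) for name, f in pool.items()}
    best = max(gains, key=gains.get)
    if gains[best] <= 1e-12:  # no feature separates the remaining examples
        return selected
    if best not in selected:
        selected.append(best)
    for value in (True, False):
        branch = [(x, y) for x, y in zip(examples, labels) if bool(pool[best](x)) == value]
        select_features([x for x, _ in branch], [y for _, y in branch], pool, selected)
    return selected


# Hypothetical toy data: residue windows labelled helix (H), coil (C), strand (E).
examples = ["AAL", "ALA", "GVP", "PGV", "VIV", "IVI"]
labels = ["H", "H", "C", "C", "E", "E"]

# Hypothetical candidate pool, including a redundant and an uninformative feature.
pool = {
    "has_A": lambda w: "A" in w,
    "has_A_copy": lambda w: "A" in w,   # redundant duplicate of has_A
    "has_V": lambda w: "V" in w,
    "has_G": lambda w: "G" in w,
    "len_is_3": lambda w: len(w) == 3,  # true for every example, so zero gain
}

selected = select_features(examples, labels, pool)
print(selected)
```

Because a feature enters the result only when it is chosen for a split, redundant copies and uninformative features tend to be left out, which matches the "small, nonredundant feature sets" described above. The returned names could then serve as extra inputs to another learner, such as a neural network.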

Related resources

Protein Structure Prediction: Selecting Salient Features from Large Candidate Pools

We introduce a parallel approach, "DT-SELECT," for selecting features used by inductive learning algorithms to predict protein secondary structure. DT-SELECT is able to rapidly choose small, nonredundant feature sets from pools containing hundreds of thousands of potentially useful features. It does this by building a decision tree, using features from the pool, that classifies a set of trainin...

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, containing all genetic traits, is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

A Contact-assisted Approach to Protein Structure Prediction and Its Assessment in CASP10

Among different approaches to predict the 3D structure of a protein, one important idea is to predict a protein residue-residue contact map and then construct a full 3D structure from the contact map. Instead of building a structure purely from contacts information, here we describe a contact-assisted structure prediction approach that uses only a few known contacts to improve the quality of alre...

In Silico Prediction and Docking of Tertiary Structure of Multifunctional Protein X of Hepatitis B Virus

Hepatitis B virus (HBV) infection is a universal health problem and may result in acute, fulminant, or chronic hepatitis, liver cirrhosis, or hepatocellular carcinoma. The sequence for protein X of HBV was retrieved from the UniProt database. ProtParam from the ExPASy server was used to investigate the physicochemical properties of the protein. Homology modeling was carried out using the Phyre2 server, and refin...

Publication date: 1993